Randomized Embedding Cluster ensembles for gene expression data analysis
نویسندگان
چکیده
In the framework of unsupervised pattern analysis of gene expression, the high dimensionality of the data as well as the accuracy of clustering algorithms and the reliability of the discovered clusters are critical problems. We propose and analyze an algorithmic scheme for unsupervised cluster ensembles, where the dimensionality reduction is obtained by means of randomized embeddings with low distortion. Multiple "base" clusterings are performed on random subspaces, approximately preserving the distances between the projected examples. In this way the accuracy of each "base" clustering is maintained, and the diversity between them is improved. By combining the multiple clusterings, we can enhance the overall accuracy and the reliability of the discovered clusters, as shown by our experimental results with high-dimensional gene expression
منابع مشابه
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
متن کاملP-73: Effect of Donor Age on The Expression Stability of GAPDH as A ReferenceGene for Gene Expression Analysis ofEquine Adipose-Derived Mesenchymal Stem Cells
Background: Adipose tissue is a main source for isolation of equine mesenchymal stem cells (MSCs) at different ages. It seems that characteristics of adipose-derived MSCs especially gene expression profile are changing along with age increase. A proper reference gene is required for normalizing data in gene expression analysis by qRT-PCR. This study aimed to evaluate whether GAPDH has a stable ...
متن کاملO-30: Comparing Expression Patterns of Endometrial Genes in Implantation Failures and Recurrent Miscarriages with Fertile Couples Following ICSI/IVF Using in Silico Analysis
Background: To screen and diagnose patients with recurrent abortions and implantation failure after IVF/ICSI, differentially expressed genes of endometrium through DNA microarrays were monitored. Materials and Methods: Microarray expression profile of GSE26787 dataset from GEO database was used to analyze gene expression profiles of 15 endometrial biopsy samples- five from control fertile (CF) ...
متن کاملFeature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine
We can reach by DNA microarray gene expression to such wealth of information with thousands of variables (genes). Analysis of this information can show genetic reasons of disease and tumor differences. In this study we try to reduce high-dimensional data by statistical method to select valuable genes with high impact as biomarkers and then classify ovarian tumor based on gene expression data of...
متن کاملO-3: Drug Repositioning by Merging Gene Expression Data Analysis and Cheminformatics Target Prediction Approaches
The transcriptional responses of drug treatments combined with a protein target prediction algorithm was utilised to associate compounds to biological genomic space. This enabled us to predict efficacy of compounds in cMap and LINCS against 181 databases of diseases extracted from GEO. 18/30 of top drugs predicted for leukemia (e.g. Leflunomide and Etoposide) and breast cancer (e.g. Tamoxifen a...
متن کامل